Professional Machine Learning Engineer v1.0

Page:    1 / 19   
Exam contains 285 questions

You need to train a computer vision model that predicts the type of government ID present in a given image using a GPU-powered virtual machine on Compute Engine. You use the following parameters:
✑ Optimizer: SGD
✑ Image shape = 224×224
✑ Batch size = 64
✑ Epochs = 10
✑ Verbose = 2
During training you encounter the following error: ResourceExhaustedError: Out Of Memory (OOM) when allocating tensor. What should you do?

  • A. Change the optimizer.
  • B. Reduce the batch size.
  • C. Change the learning rate.
  • D. Reduce the image shape.


Answer : B

Reference:
https://github.com/tensorflow/tensorflow/issues/136
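
For context, here is a minimal Keras sketch (hypothetical model and dummy data) showing how lowering the batch size addresses the OOM without touching the optimizer, learning rate, or image shape:

```python
# Minimal sketch (hypothetical model and dummy data): lowering batch_size
# reduces the per-step activation memory on the GPU, which is the usual fix
# for a ResourceExhaustedError during training.
import tensorflow as tf

model = tf.keras.applications.ResNet50(weights=None, classes=5)  # example ID classifier
model.compile(optimizer="sgd", loss="sparse_categorical_crossentropy")

# Stand-in for the real 224x224 RGB ID images.
images = tf.random.uniform((256, 224, 224, 3))
labels = tf.random.uniform((256,), maxval=5, dtype=tf.int32)

# batch_size=64 caused the OOM; halving it (repeatedly, if needed) shrinks the
# tensors allocated per step while leaving the model and data unchanged.
model.fit(images, labels, batch_size=32, epochs=10, verbose=2)
```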

You developed an ML model with AI Platform, and you want to move it to production. You serve a few thousand queries per second and are experiencing latency issues. Incoming requests are served by a load balancer that distributes them across multiple Kubeflow CPU-only pods running on Google Kubernetes Engine (GKE). Your goal is to improve the serving latency without changing the underlying infrastructure. What should you do?

  • A. Significantly increase the max_batch_size TensorFlow Serving parameter.
  • B. Switch to the tensorflow-model-server-universal version of TensorFlow Serving.
  • C. Significantly increase the max_enqueued_batches TensorFlow Serving parameter.
  • D. Recompile TensorFlow Serving using the source to support CPU-specific optimizations. Instruct GKE to choose an appropriate baseline minimum CPU platform for serving nodes.


Answer : D

You have a demand forecasting pipeline in production that uses Dataflow to preprocess raw data prior to model training and prediction. During preprocessing, you employ Z-score normalization on data stored in BigQuery and write it back to BigQuery. New training data is added every week. You want to make the process more efficient by minimizing computation time and manual intervention. What should you do?

  • A. Normalize the data using Google Kubernetes Engine.
  • B. Translate the normalization algorithm into SQL for use with BigQuery.
  • C. Use the normalizer_fn argument in TensorFlow's Feature Column API.
  • D. Normalize the data with Apache Spark using the Dataproc connector for BigQuery.


Answer : B
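
As context for option B, a hedged sketch (placeholder project, dataset, and column names) of pushing the Z-score computation into BigQuery SQL so the weekly refresh runs inside BigQuery with no separate preprocessing pass:

```python
# Sketch of option B with placeholder names: Z-score normalization expressed as
# BigQuery SQL, so new weekly data is normalized where it already lives.
from google.cloud import bigquery

client = bigquery.Client()

query = """
CREATE OR REPLACE TABLE `my_project.demand.training_data_normalized` AS
SELECT
  item_id,
  week,
  -- Z-score: subtract the column mean and divide by the standard deviation.
  (demand - AVG(demand) OVER ()) / STDDEV(demand) OVER () AS demand_z
FROM `my_project.demand.training_data_raw`
"""

client.query(query).result()  # the computation happens entirely inside BigQuery
```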

You need to design a customized deep neural network in Keras that will predict customer purchases based on their purchase history. You want to explore model performance using multiple model architectures, store training data, and be able to compare the evaluation metrics in the same dashboard. What should you do?

  • A. Create multiple models using AutoML Tables.
  • B. Automate multiple training runs using Cloud Composer.
  • C. Run multiple training jobs on AI Platform with similar job names.
  • D. Create an experiment in Kubeflow Pipelines to organize multiple runs.


Answer : C
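
Whatever platform runs the jobs, one common way to compare several Keras architectures on a single dashboard is to write each run's metrics to its own TensorBoard log directory. A sketch with dummy data and placeholder paths:

```python
# Sketch (dummy data, placeholder log paths): train two candidate architectures
# and log each run separately so TensorBoard shows their metrics side by side.
import tensorflow as tf

def build_model(hidden_units):
    return tf.keras.Sequential([
        tf.keras.layers.Dense(hidden_units, activation="relu", input_shape=(20,)),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])

x = tf.random.uniform((512, 20))                        # stand-in for purchase-history features
y = tf.random.uniform((512,), maxval=2, dtype=tf.int32)

for name, units in [("small-32", 32), ("large-128", 128)]:
    model = build_model(units)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    tb = tf.keras.callbacks.TensorBoard(log_dir=f"logs/{name}")  # could also be a gs:// path
    model.fit(x, y, epochs=5, callbacks=[tb], verbose=2)
```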

You are developing a Kubeflow pipeline on Google Kubernetes Engine. The first step in the pipeline is to issue a query against BigQuery. You plan to use the results of that query as the input to the next step in your pipeline. You want to achieve this in the easiest way possible. What should you do?

  • A. Use the BigQuery console to execute your query, and then save the query results into a new BigQuery table.
  • B. Write a Python script that uses the BigQuery API to execute queries against BigQuery. Execute this script as the first step in your Kubeflow pipeline.
  • C. Use the Kubeflow Pipelines domain-specific language to create a custom component that uses the Python BigQuery client library to execute queries.
  • D. Locate the Kubeflow Pipelines repository on GitHub. Find the BigQuery Query Component, copy that component's URL, and use it to load the component into your pipeline. Use the component to execute queries against BigQuery.


Answer : A
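
For context on option D, the Kubeflow Pipelines SDK can load a reusable component directly from a URL and use it as a pipeline step. A sketch assuming the KFP v1 SDK, with a placeholder component URL, query, and paths:

```python
# Sketch assuming the Kubeflow Pipelines v1 SDK; the component URL, query, and
# GCS path below are placeholders, not the exact published values.
import kfp
from kfp import dsl
from kfp.components import load_component_from_url

bigquery_query_op = load_component_from_url(
    "https://raw.githubusercontent.com/kubeflow/pipelines/<commit>/components/gcp/bigquery/query/component.yaml"
)

@dsl.pipeline(name="bq-query-then-train")
def pipeline(project_id: str):
    query_task = bigquery_query_op(
        query="SELECT * FROM `my_dataset.my_table`",              # placeholder query
        project_id=project_id,
        output_gcs_path="gs://my-bucket/query-results/data.csv",  # placeholder path
    )
    # Downstream steps take query_task.outputs as their input.
```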

You are building a model to predict daily temperatures. You split the data randomly and then transformed the training and test datasets. Temperature data for model training is uploaded hourly. During testing, your model performed with 97% accuracy; however, after deploying to production, the model's accuracy dropped to 66%. How can you make your production model more accurate?

  • A. Normalize the data for the training and test datasets as two separate steps.
  • B. Split the training and test data based on time rather than a random split to avoid leakage.
  • C. Add more data to your test set to ensure that you have a fair distribution and sample for testing.
  • D. Apply data transformations before splitting, and cross-validate to make sure that the transformations are applied to both the training and test sets.


Answer : D
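
For context on the time-based split discussed in option B, a minimal pandas sketch (hypothetical file and column names) that holds out the most recent data rather than a random sample:

```python
# Sketch with hypothetical names: for time-series data such as hourly
# temperatures, split on time so the test set contains only rows that come
# after the training data, avoiding leakage from "future" observations.
import pandas as pd

df = pd.read_csv("temperatures.csv", parse_dates=["timestamp"])  # placeholder file
df = df.sort_values("timestamp")

cutoff = df["timestamp"].quantile(0.8)   # hold out the last 20% of the timeline
train = df[df["timestamp"] <= cutoff]
test = df[df["timestamp"] > cutoff]

# Fit any transformation (e.g., Z-score parameters) on `train` only, then apply
# the fitted transformation to `test`.
```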

You are developing models to classify customer support emails. You created models with TensorFlow Estimators using small datasets on your on-premises system, but you now need to train the models using large datasets to ensure high performance. You will port your models to Google Cloud and want to minimize code refactoring and infrastructure overhead for easier migration from on-prem to cloud. What should you do?

  • A. Use AI Platform for distributed training.
  • B. Create a cluster on Dataproc for training.
  • C. Create a Managed Instance Group with autoscaling.
  • D. Use Kubeflow Pipelines to train on a Google Kubernetes Engine cluster.


Answer : C

You have trained a text classification model in TensorFlow using AI Platform. You want to use the trained model for batch predictions on text data stored in BigQuery while minimizing computational overhead. What should you do?

  • A. Export the model to BigQuery ML.
  • B. Deploy and version the model on AI Platform.
  • C. Use Dataflow with the SavedModel to read the data from BigQuery.
  • D. Submit a batch prediction job on AI Platform that points to the model location in Cloud Storage.


Answer : A
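
As context for option A, BigQuery ML can import a TensorFlow SavedModel and run batch prediction with ML.PREDICT directly over the table. A sketch with placeholder project, dataset, and Cloud Storage paths (the input column must be aliased to match the SavedModel's input signature):

```python
# Sketch of option A with placeholder names: register the exported SavedModel in
# BigQuery ML, then score the BigQuery text table in place with ML.PREDICT.
from google.cloud import bigquery

client = bigquery.Client()

client.query("""
CREATE OR REPLACE MODEL `my_project.text.classifier`
OPTIONS (MODEL_TYPE = 'TENSORFLOW',
         MODEL_PATH = 'gs://my-bucket/saved_model/*')
""").result()

predictions = client.query("""
SELECT *
FROM ML.PREDICT(MODEL `my_project.text.classifier`,
                (SELECT email_body AS input_text   -- alias must match the model signature
                 FROM `my_project.text.emails`))
""").to_dataframe()
```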

You work with a data engineering team that has developed a pipeline to clean your dataset and save it in a Cloud Storage bucket. You have created an ML model and want to use the data to refresh your model as soon as new data is available. As part of your CI/CD workflow, you want to automatically run a Kubeflow Pipelines training job on Google Kubernetes Engine (GKE). How should you architect this workflow?

  • A. Configure your pipeline with Dataflow, which saves the files in Cloud Storage. After the file is saved, start the training job on a GKE cluster.
  • B. Use App Engine to create a lightweight Python client that continuously polls Cloud Storage for new files. As soon as a file arrives, initiate the training job.
  • C. Configure a Cloud Storage trigger to send a message to a Pub/Sub topic when a new file is available in a storage bucket. Use a Pub/Sub-triggered Cloud Function to start the training job on a GKE cluster.
  • D. Use Cloud Scheduler to schedule jobs at a regular interval. For the first step of the job, check the timestamp of objects in your Cloud Storage bucket. If there are no new files since the last run, abort the job.


Answer : C
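
A hedged sketch of option C: a Pub/Sub-triggered Cloud Function (Python) that starts a Kubeflow Pipelines run whenever Cloud Storage publishes a new-object notification. The KFP endpoint, pipeline package, and parameter names are placeholders:

```python
# Sketch of option C; the KFP host, pipeline file, and parameter names below are
# placeholders. The function is deployed with a Pub/Sub trigger on the topic that
# receives Cloud Storage "object finalize" notifications.
import base64
import json
import kfp

def trigger_training(event, context):
    """Background Cloud Function entry point for a Pub/Sub message."""
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    new_file = f"gs://{payload['bucket']}/{payload['name']}"

    client = kfp.Client(host="https://<kfp-endpoint>")        # placeholder endpoint
    client.create_run_from_pipeline_package(
        pipeline_file="training_pipeline.yaml",               # compiled pipeline bundled with the function
        arguments={"training_data_path": new_file},
        experiment_name="data-refresh-training",
    )
```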

You have a functioning end-to-end ML pipeline that involves tuning the hyperparameters of your ML model using AI Platform, and then using the best-tuned parameters for training. Hypertuning is taking longer than expected and is delaying the downstream processes. You want to speed up the tuning job without significantly compromising its effectiveness. Which actions should you take? (Choose two.)

  • A. Decrease the number of parallel trials.
  • B. Decrease the range of floating-point values.
  • C. Set the early stopping parameter to TRUE.
  • D. Change the search algorithm from Bayesian search to random search.
  • E. Decrease the maximum number of trials during subsequent training phases.


Answer : BD

Reference:
https://cloud.google.com/ai-platform/training/docs/hyperparameter-tuning-overview
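
For context, the knobs these options refer to live in the AI Platform Training hyperparameterSpec. An illustrative job configuration (field names from the training API; values and package paths are examples only):

```python
# Illustrative hyperparameterSpec for an AI Platform Training job; values are
# examples. Early stopping, random search, fewer trials, and narrower parameter
# ranges all shorten the tuning phase at some cost in search quality.
training_input = {
    "scaleTier": "BASIC_GPU",
    "pythonModule": "trainer.task",
    "packageUris": ["gs://my-bucket/trainer-0.1.tar.gz"],  # placeholder package
    "region": "us-central1",
    "hyperparameters": {
        "goal": "MAXIMIZE",
        "hyperparameterMetricTag": "accuracy",
        "maxTrials": 20,
        "maxParallelTrials": 5,
        "enableTrialEarlyStopping": True,   # option C: stop unpromising trials early
        "algorithm": "RANDOM_SEARCH",       # option D: cheaper per trial than Bayesian search
        "params": [
            {
                "parameterName": "learning_rate",
                "type": "DOUBLE",
                "minValue": 0.001,           # option B: a narrower range to search over
                "maxValue": 0.01,
                "scaleType": "UNIT_LOG_SCALE",
            }
        ],
    },
}
```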

Your team is building an application for a global bank that will be used by millions of customers. You built a forecasting model that predicts customers' account balances 3 days in the future. Your team will use the results in a new feature that will notify users when their account balance is likely to drop below $25. How should you serve your predictions?

  • A. 1. Create a Pub/Sub topic for each user. 2. Deploy a Cloud Function that sends a notification when your model predicts that a user's account balance will drop below the $25 threshold.
  • B. 1. Create a Pub/Sub topic for each user. 2. Deploy an application on the App Engine standard environment that sends a notification when your model predicts that a user's account balance will drop below the $25 threshold.
  • C. 1. Build a notification system on Firebase. 2. Register each user with a user ID on the Firebase Cloud Messaging server, which sends a notification when the average of all account balance predictions drops below the $25 threshold.
  • D. 1. Build a notification system on Firebase. 2. Register each user with a user ID on the Firebase Cloud Messaging server, which sends a notification when your model predicts that a user's account balance will drop below the $25 threshold.


Answer : A

You work for an advertising company and want to understand the effectiveness of your company's latest advertising campaign. You have streamed 500 MB of campaign data into BigQuery. You want to query the table, and then manipulate the results of that query with a pandas dataframe in an AI Platform notebook. What should you do?

  • A. Use AI Platform Notebooks' BigQuery cell magic to query the data, and ingest the results as a pandas dataframe.
  • B. Export your table as a CSV file from BigQuery to Google Drive, and use the Google Drive API to ingest the file into your notebook instance.
  • C. Download your table from BigQuery as a local CSV file, and upload it to your AI Platform notebook instance. Use pandas.read_csv to ingest the file as a pandas dataframe.
  • D. From a bash cell in your AI Platform notebook, use the bq extract command to export the table as a CSV file to Cloud Storage, and then use gsutil cp to copy the data into the notebook. Use pandas.read_csv to ingest the file as a pandas dataframe.


Answer : C

Reference:
https://cloud.google.com/bigquery/docs/bigquery-storage-python-pandas
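
As context for option A, the BigQuery cell magic available in AI Platform Notebooks runs a query and returns the result as a pandas dataframe in one step. A notebook sketch with placeholder project and table names:

```python
# Notebook sketch with placeholder names. Cell 1: load the BigQuery magic.
%load_ext google.cloud.bigquery

# Cell 2: the cell magic executes the query in BigQuery and binds the result to
# the pandas DataFrame named campaign_df.
%%bigquery campaign_df
SELECT campaign_id, SUM(clicks) AS clicks
FROM `my_project.ads.campaign_events`
GROUP BY campaign_id

# Cell 3: manipulate the result like any other pandas DataFrame.
campaign_df.describe()
```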

You are an ML engineer at a global car manufacturer. You need to build an ML model to predict car sales in different cities around the world. Which features or feature crosses should you use to train city-specific relationships between car type and number of sales?

  • A. Three individual features: binned latitude, binned longitude, and one-hot encoded car type.
  • B. One feature obtained as an element-wise product between latitude, longitude, and car type.
  • C. One feature obtained as an element-wise product between binned latitude, binned longitude, and one-hot encoded car type.
  • D. Two feature crosses as an element-wise product: the first between binned latitude and one-hot encoded car type, and the second between binned longitude and one-hot encoded car type.


Answer : C
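
As context for option C, a sketch using the TensorFlow Feature Column API (illustrative bucket boundaries and car-type vocabulary) that crosses binned latitude, binned longitude, and one-hot encoded car type into a single feature:

```python
# Sketch with illustrative boundaries and vocabulary: one feature cross of
# binned latitude x binned longitude x car type lets the model learn
# city-specific relationships between car type and sales.
import tensorflow as tf

latitude = tf.feature_column.numeric_column("latitude")
longitude = tf.feature_column.numeric_column("longitude")

lat_binned = tf.feature_column.bucketized_column(latitude, boundaries=list(range(-90, 91, 5)))
lon_binned = tf.feature_column.bucketized_column(longitude, boundaries=list(range(-180, 181, 5)))
car_type = tf.feature_column.categorical_column_with_vocabulary_list(
    "car_type", ["sedan", "suv", "truck", "van"])

city_car_cross = tf.feature_column.crossed_column(
    [lat_binned, lon_binned, car_type], hash_bucket_size=10_000)
cross_feature = tf.feature_column.indicator_column(city_car_cross)
```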

You work for a large technology company that wants to modernize their contact center. You have been asked to develop a solution to classify incoming calls by product so that requests can be more quickly routed to the correct support team. You have already transcribed the calls using the Speech-to-Text API. You want to minimize data preprocessing and development time. How should you build the model?

  • A. Use the AI Platform Training built-in algorithms to create a custom model.
  • B. Use AutoML Natural Language to extract custom entities for classification.
  • C. Use the Cloud Natural Language API to extract custom entities for classification.
  • D. Build a custom model to identify the product keywords from the transcribed calls, and then run the keywords through a classification algorithm.


Answer : A

You are training a TensorFlow model on a structured dataset with 100 billion records stored in several CSV files. You need to improve the input/output execution performance. What should you do?

  • A. Load the data into BigQuery, and read the data from BigQuery.
  • B. Load the data into Cloud Bigtable, and read the data from Bigtable.
  • C. Convert the CSV files into shards of TFRecords, and store the data in Cloud Storage.
  • D. Convert the CSV files into shards of TFRecords, and store the data in the Hadoop Distributed File System (HDFS).


Answer : B

Reference:
https://cloud.google.com/dataflow/docs/guides/templates/provided-batch
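
As context for the TFRecord options, a sketch (placeholder Cloud Storage paths, toy feature encoding) of writing sharded TFRecord files and reading them back through a parallel tf.data pipeline, which typically gives much better input throughput than parsing CSV:

```python
# Sketch with placeholder paths: sharded TFRecords read through an interleaved,
# prefetched tf.data pipeline avoid the per-row parsing cost of CSV input.
import tensorflow as tf

def to_example(feature_values, label):
    """Serialize one record as a tf.train.Example (toy schema for illustration)."""
    return tf.train.Example(features=tf.train.Features(feature={
        "features": tf.train.Feature(float_list=tf.train.FloatList(value=feature_values)),
        "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    })).SerializeToString()

# Writing one shard; at this scale a Dataflow/Beam job would write many shards in parallel.
with tf.io.TFRecordWriter("gs://my-bucket/data/train-00000-of-01024.tfrecord") as writer:
    writer.write(to_example([0.1, 0.2, 0.3], 1))

# Reading: list the shards, interleave them, and parse/batch in parallel.
files = tf.data.Dataset.list_files("gs://my-bucket/data/train-*.tfrecord")
dataset = files.interleave(tf.data.TFRecordDataset, num_parallel_calls=tf.data.AUTOTUNE)
dataset = dataset.batch(1024).prefetch(tf.data.AUTOTUNE)
```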
